Skip to content

Fix #1692: Null out fields after use in lazy initialization #4289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 9, 2018

Conversation

allanrenucci
Copy link
Contributor

@allanrenucci allanrenucci commented Apr 10, 2018

Fix #1692.
Private fields that are only used during lazy val initialization can be assigned null once the lazy val is initialized. This is not just an optimization, but is needed for correctness to prevent memory leaks.

@allanrenucci
Copy link
Contributor Author

test performance please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 0 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/4289/ to see the changes.

Benchmarks is based on merging with master (2762567)

* - its type is nullable, or is an expression type (e.g. => Int)
* - is on used in a lazy val initializer
* - defined in the same class as the lazy val
* - TODO from Scalac? from a non-trait class
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@retronym Scalac doesn't null out singly used private fields after use in lazy initialization if they are defined in a trait: https://github.com/scala/scala/blob/b3e380a08e4f8781f89412c4921f380a4b4758e9/src/compiler/scala/tools/nsc/transform/Mixin.scala#L381
What is the reason for this restriction?

@allanrenucci allanrenucci changed the title [WIP] Fix #1692: Null out fields after use in lazy initialization Fix #1692: Null out fields after use in lazy initialization Apr 11, 2018
@allanrenucci allanrenucci requested a review from smarter April 11, 2018 16:05
val name = "collectNullableFields"
}

/** Collect fields that can be null out after use in lazy initialization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be null -> can be nulled

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same typo here

Copy link
Contributor

@OlivierBlanvillain OlivierBlanvillain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👌

val name = "collectNullableFields"
}

/** Collect fields that can be null out after use in lazy initialization.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same typo here

* fields that are only used within a lazy val initializer. This is not just an optimization,
* but is needed for correctness to prevent memory leaks. E.g.
*
* {{{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a markdown code block instead

ref(field).becomes(nullConst)
}
}

/** Create non-threadsafe lazy accessor equivalent to such code
* def methodSymbol() = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is off... While you're at it, could you also add the missing code blocks in this file?

val from = ctx.owner
val isNullable =
from.is(Lazy) && from.isField && // used in lazy field initializer
from.owner.eq(sym.owner) // lazy val and field in the same class
Copy link
Contributor

@OlivierBlanvillain OlivierBlanvillain Apr 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cover cases like this:?

class TestByNameLazy(byNameMsg: => String) {
  lazy val byLazyValMsg = {
    def foo = byNameMsg
    foo
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am being conservative here (so is Scalac).

In this case it is Ok to null out byNameMsg once byLazyValMsg is initialised. But there are cases where a reference to byNameMsg might leak out of the lazy val initialiser. For example:

trait Foo { def bar: String }
class TestByNameLazy(byNameMsg: => String) {
  lazy val lazyFoo = {
    def foo = byNameMsg
    new Foo { def bar = foo }
  }

 lazyFoo.bar // not Ok to null out byNameMsg
}

I haven't put too much thought into figuring out when it is safe to null out byNameMsg, but I believe requiring that the lazy val and the field are defined in the same class, and when using the field the owner is the lazy val is restrictive enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

case _ =>
}

result.mapValues(_.toList).toMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could return a Map[Symbol, Seq[Symbol]] to avoid this conversion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change it to collection.Map[Symbol, List[Symbol]] to avoid the conversion from mutable map to immutable map. The conversion from ListBuffer to List is pretty much free since mapValues return a view on the collection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I took a different approach. I return a mutable map so that LazyVals can clears entries that are not needed anymore

@@ -73,6 +73,7 @@ class Compiler {
new LiftTry, // Put try expressions that might execute on non-empty stacks into their own methods
new HoistSuperArgs, // Hoist complex arguments of supercalls to enclosing scope
new ClassOf, // Expand `Predef.classOf` calls.
new CollectNullableFields, // Collect fields that can be null out after use in lazy initialization
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to do this so early? I think the closer the collection is to the LazyVals phase, the better. It cannot be in the same group as LazyVals or the group before since that's Erasure, but it could be in the ErasedDecls group, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, but I think I need to run before Getters. Consider the following example:

class LazyNotNullable {
  private val a = 'A'.toInt // not nullable type
  lazy val l0 = a
}

If I run after getter, I only see a as def a: => Int. But then, how can I differentiate between this a and the following:

class LazyNullable(a: => Int) { 
  lazy val l0 = a
}

The first a is not nullable, the second is. Any suggestions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you run after ElimByName then all vals with type => Int will have been replaced by vals with type () => Int, so you don't need to null anything with ExprType.



override def prepareForUnit(tree: Tree)(implicit ctx: Context) = {
lazyValNullables = ctx.collectNullableFieldsPhase.asInstanceOf[CollectNullableFields].lazyValNullables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we should clear() this map once we're done with it, this can be done by overriding transformUnit.

private[this] var nullability: IdentityHashMap[Symbol, FieldInfo] = _

override def prepareForUnit(tree: Tree)(implicit ctx: Context) = {
nullability = new IdentityHashMap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will do the wrong thing if you have more than one unit, by the time you get to the group where LazyVals is present, this method has been called for each unit, so all maps but the last one are forgotten.

@liufengyun
Copy link
Contributor

test performance with #fast please

@dottybot
Copy link
Member

performance test scheduled: 1 job(s) in queue, 1 running.

@dottybot
Copy link
Member

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/4289/ to see the changes.

Benchmarks is based on merging with master (c28ea78)

@allanrenucci allanrenucci force-pushed the fix-1692 branch 3 times, most recently from 31fb8b6 to 968b7de Compare May 1, 2018 16:30
@allanrenucci
Copy link
Contributor Author

The optimisation triggers twice in the Dotty code base:

@allanrenucci allanrenucci force-pushed the fix-1692 branch 2 times, most recently from 1df91db to a56c5ae Compare May 1, 2018 16:39
private[this] var lazyValNullables: IdentityHashMap[Symbol, mutable.ListBuffer[Symbol]] = _
private def nullableFor(sym: Symbol)(implicit ctx: Context) = {
val nullables = lazyValNullables.remove(sym)
if (nullables == null || sym.is(Flags.Module)) Nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Module check is needed, add a comment explaining why

private[this] case class Nullable(by: Symbol) extends FieldInfo

/** Whether or not a field is nullable */
private[this] var nullability: IdentityHashMap[Symbol, FieldInfo] = _
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use MutableSymbolMap instead.

/** Running after `ElimByName` to see by names as nullable types.
*
* We don't necessary need to run after `Getters`, but the implementation
* could be simplified if we were to run before.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment isn't very clear, what simplifications?

Private fields that are only used during lazy val initialization can be
assigned null once the lazy val is initialized. This is not just an
optimization, but is needed for correctness to prevent memory leaks.
@allanrenucci allanrenucci merged commit 9f110e2 into scala:master May 9, 2018
@allanrenucci allanrenucci deleted the fix-1692 branch May 9, 2018 18:53
@smarter smarter mentioned this pull request Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants